Valence extraction using EM selection and co-occurrence matrices

Author

  • Lukasz Debowski
Abstract

This paper discusses two new procedures for extracting verb valences from raw texts, with an application to the Polish language. The first novel technique, the EM selection algorithm, performs unsupervised disambiguation of valence frame forests, obtained by applying a non-probabilistic deep grammar parser and some post-processing to the text. The second new idea concerns filtering of incorrect frames detected in the parsed text and is motivated by an observation that verbs which take similar arguments tend to have similar frames. This phenomenon is described in terms of newly introduced co-occurrence matrices. Using co-occurrence matrices, we split filtering into two steps. The list of valid arguments is first determined for each verb, whereas the pattern according to which the arguments are combined into frames is computed in the following stage. Our best extracted dictionary reaches an $F$-score of 45%, compared to an $F$-score of 39% for the standard frame-based BHT filtering.
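The abstract reports dictionary quality as an $F$-score but does not say how it is computed. As a minimal sketch, assuming the usual definition (harmonic mean of precision and recall) and assuming a dictionary is represented as a set of (verb, frame) pairs, the comparison against a gold-standard dictionary could look like the following. The function name `f_score`, the frame notation, and the sample entries are hypothetical illustrations, not taken from the paper.

```python
def f_score(extracted, gold):
    """Harmonic mean of precision and recall over (verb, frame) pairs.

    Assumption: both dictionaries are iterables of hashable
    (verb, frame) pairs; the paper's exact evaluation protocol
    may differ from this simple set comparison.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    if true_positives == 0:
        # Covers empty inputs and the case of no overlap.
        return 0.0
    precision = true_positives / len(extracted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: frames are written as plain strings.
gold = {
    ("give", "subj obj iobj"),
    ("give", "subj obj"),
    ("sleep", "subj"),
    ("think", "subj comp"),
}
extracted = {
    ("give", "subj obj iobj"),
    ("sleep", "subj"),
    ("sleep", "subj obj"),  # an incorrect frame that filtering should remove
}

score = f_score(extracted, gold)
print(f"F-score: {score:.3f}")
```

Here two of the three extracted pairs are correct and two of the four gold pairs are found, so precision is 2/3, recall is 1/2, and the score is 4/7. Removing the incorrect frame raises precision without changing recall, which is the effect a filtering step aims for.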

Journal
  • Language Resources and Evaluation

Volume 43

Publication date 2009